Automated deep learning-based AMD detection and staging in real-world OCT datasets (PINNACLE study report 5)

Real-world retinal optical coherence tomography (OCT) scans are available in abundance in primary and secondary eye care centres. They contain a wealth of information to be analyzed in retrospective studies. The associated electronic health records alone are often not enough to generate a high-quality dataset for clinical, statistical, and machine learning analysis. We have developed a deep learning-based age-related macular degeneration (AMD) stage classifier to efficiently identify the first onset of the early/intermediate (iAMD), atrophic (GA), and neovascular (nAMD) stages of AMD in retrospective data. We trained a two-stage convolutional neural network to classify macula-centered 3D volumes from Topcon OCT images into 4 classes: Normal, iAMD, GA and nAMD. In the first stage, a 2D ResNet50 is trained to identify the disease categories on the individual OCT B-scans, while in the second stage, four smaller models (ResNets) use the concatenated B-scan-wise output from the first stage to classify the entire OCT volume. Classification uncertainty estimates are generated with Monte-Carlo dropout at inference time. The model was trained on a real-world OCT dataset of 3765 scans of 1849 eyes and extensively evaluated, reaching an average ROC-AUC of 0.94 on a real-world test set.
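As an illustration of how Monte-Carlo dropout outputs can be turned into an uncertainty estimate, the following minimal sketch aggregates the softmax vectors of several stochastic forward passes. The function name, the use of the mean as the final prediction, and predictive entropy as the uncertainty measure are assumptions made for this example; the text above does not specify how the uncertainty is summarized.

```python
import math

# Class order is an assumption made for this illustration.
CLASSES = ("Normal", "iAMD", "GA", "nAMD")


def summarize_mc_dropout(samples):
    """Aggregate T softmax vectors from T forward passes with dropout active.

    Returns (predicted class index, mean probabilities, predictive entropy).
    Predictive entropy is one common uncertainty summary; it is an assumption
    here, not necessarily the measure used in the study.
    """
    if not samples:
        raise ValueError("at least one forward pass is required")
    n_passes = len(samples)
    n_classes = len(samples[0])
    # Average the class probabilities over all stochastic passes.
    mean_probs = [sum(s[c] for s in samples) / n_passes for c in range(n_classes)]
    # Entropy of the averaged distribution (in nats); zero terms are skipped.
    entropy = -sum(p * math.log(p) for p in mean_probs if p > 0.0)
    predicted = max(range(n_classes), key=lambda c: mean_probs[c])
    return predicted, mean_probs, entropy
```

Passes that agree on one class yield low entropy, while passes that disagree spread the mean probabilities and raise it.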


Grading protocol
Grading at the OCT volume level was performed with an in-house web application designed for efficient grading of OCT volumes, with hotkey support and a digital caliper.

Gold-standard labels:
The validation and test data were graded by a retinal expert using the following criteria to label a B-scan:
• MNV: presence of either PED or SHRM together with either IRF or SRF, based on Metrangolo et al. 45.
• DRUSEN: presence of at least one drusenoid elevation of the RPE.
• MA: cRORA as defined by presence of all the following three features in an axially overlapping manner over an extent of more than 250 µm: choroidal hypertransmission, RPE attenuation or loss, evidence of overlying photoreceptor degeneration, based on Sadda et al. 13.
Silver-standard labels:
The training data was graded by an experienced non-medical grader with relaxed rules as follows:
• MNV: a clear presence of either IRF or SRF.
• DRUSEN: a clear presence of at least one drusenoid elevation of the RPE.
• MA: a clear presence of atrophy as defined by presence of all of the following three features in an axially overlapping manner over an extent of more than 150 µm: choroidal hypertransmission, RPE attenuation or loss, evidence of overlying photoreceptor degeneration.
• NORMAL: healthy appearing retina without visible distortions and deformations of the retinal layers.
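The gold- and silver-standard criteria above differ mainly in the MNV rule and in the minimum extent required for MA. The sketch below encodes them as boolean rules over a hypothetical record of per-B-scan findings. The `BscanFindings` class, its field names, and the `bscan_labels` function are illustrative assumptions and not part of the grading application. The grader's judgment of "clear presence" and of a healthy-appearing retina (NORMAL) cannot be captured this way and is left out.

```python
from dataclasses import dataclass


@dataclass
class BscanFindings:
    """Hypothetical per-B-scan observations recorded by a grader."""
    ped: bool = False
    shrm: bool = False
    irf: bool = False
    srf: bool = False
    drusen: bool = False
    hypertransmission: bool = False
    rpe_loss: bool = False
    photoreceptor_degeneration: bool = False
    atrophy_extent_um: float = 0.0  # extent of the axially overlapping features


# Minimum extent (µm) that must be exceeded for MA under each standard.
MA_MIN_EXTENT_UM = {"gold": 250.0, "silver": 150.0}


def bscan_labels(f, standard="gold"):
    """Return the set of disease biomarkers (MNV, DRUSEN, MA) for one B-scan."""
    labels = set()
    fluid = f.irf or f.srf
    # Gold standard needs PED or SHRM in addition to fluid; silver needs fluid only.
    mnv = ((f.ped or f.shrm) and fluid) if standard == "gold" else fluid
    if mnv:
        labels.add("MNV")
    if f.drusen:
        labels.add("DRUSEN")
    all_three = f.hypertransmission and f.rpe_loss and f.photoreceptor_degeneration
    if all_three and f.atrophy_extent_um > MA_MIN_EXTENT_UM[standard]:
        labels.add("MA")
    return labels
```

For example, a B-scan with IRF but neither PED nor SHRM would be labelled MNV under the silver rules but not under the gold rules.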
As soon as one B-scan of a volume was graded as showing one of the biomarkers MNV, DRUSEN, or MA, the whole volume was assigned that label. Grading was independent for each biomarker, i.e. a volume can simultaneously carry the labels MNV and DRUSEN. The NORMAL label was determined by the absence of these three disease-related biomarkers.
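The volume-level rule described here can be sketched as a union over the B-scan labels, with NORMAL as the fallback. The function name and the representation of B-scan labels as Python sets are assumptions made for illustration.

```python
# The three disease-related biomarkers named in the grading protocol.
BIOMARKERS = ("MNV", "DRUSEN", "MA")


def volume_labels(bscan_label_sets):
    """Derive the volume-level label set from per-B-scan label sets.

    A biomarker found on any single B-scan is assigned to the whole volume.
    Biomarkers are independent, so several can be present at once.
    NORMAL is assigned only when none of the three biomarkers is present.
    """
    found = set()
    for labels in bscan_label_sets:
        found.update(label for label in labels if label in BIOMARKERS)
    return found if found else {"NORMAL"}
```

Usage: `volume_labels([{"NORMAL"}, {"MNV"}, {"DRUSEN"}])` yields a volume labelled with both MNV and DRUSEN, while a volume whose B-scans carry no disease biomarker yields `{"NORMAL"}`.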

B-scan-level classification
Training set: A total of 106,892 2D B-scans acquired with Spectralis OCT (Heidelberg Engineering, Heidelberg, Germany) from a publicly available dataset 26, supplemented with 7,829 B-scans showing atrophy from an internal dataset, were used to pre-train the B-scan model. This dataset is denoted as KERM. The Topcon training set for the B-scan classifier itself consisted of 2,967 B-scan slices from 1,059 OCT volumes of 372 eyes of 358 patients, extracted from the volume-level training set of MDS described above. The B-scans were graded by an experienced non-medical grader into MNV, DRUSEN, MA and NORMAL and can therefore be considered silver-standard labeled. Please refer to Table S2 for more details.
Validation and test set: The B-scans for the validation and test sets were extracted deterministically (5 B-scans per volume, at percentage y-positions 30%, 46%, 50%, 54% and 70%) from the respective volume-level validation and test sets of MDS and graded by a retinal expert for the presence or absence of the four biomarkers (MNV, MA, DRUSEN and NORMAL). Please see Table S2 for the distribution of biomarkers on these B-scans.
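The deterministic extraction can be illustrated with a small helper that maps the five percentage y-positions to B-scan indices. The rounding convention (floor of percentage times number of B-scans, 0-based indexing) and the function name are assumptions; the text does not state how percentages are converted to slice indices.

```python
# Percentage y-positions stated in the text.
Y_POSITIONS_PERCENT = (30, 46, 50, 54, 70)


def select_bscan_indices(n_bscans, positions=Y_POSITIONS_PERCENT):
    """Map percentage y-positions to 0-based B-scan indices.

    Assumed convention: index = floor(percent / 100 * n_bscans),
    clamped to the last valid index.
    """
    if n_bscans <= 0:
        raise ValueError("a volume must contain at least one B-scan")
    last = n_bscans - 1
    # Integer arithmetic avoids floating-point rounding surprises.
    return [min(last, (p * n_bscans) // 100) for p in positions]
```

Under this convention, a volume with a hypothetical 128 B-scans would contribute the slices at indices 38, 58, 64, 69 and 89, so the sample is denser around the centre of the volume.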

Results
The classification performance at the B-scan level is reported in Table S3. Examples of B-scans from the PINN test set and their classifications are shown in Figure S2.
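The threshold-based metrics named in the supplementary table captions (balanced accuracy, accuracy, MCC, F1 score, sensitivity and specificity) can be computed per biomarker from binary presence/absence labels. The sketch below shows the standard definitions. The function name and the choice to return 0.0 for undefined ratios are assumptions, and ROC-AUC is omitted because it requires continuous scores rather than thresholded predictions.

```python
import math


def binary_metrics(y_true, y_pred):
    """Compute presence/absence metrics for one biomarker (labels are 0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    # Undefined ratios (empty denominators) are reported as 0.0 by assumption.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "BACC": (sensitivity + specificity) / 2,
        "ACC": accuracy,
        "MCC": mcc,
        "F1": f1,
        "sensitivity": sensitivity,
        "specificity": specificity,
    }
```

Because several biomarkers can be present at once, such metrics would be evaluated separately for each biomarker in a one-vs-rest fashion.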

Figure S2. Example B-scan classification on the test set. B-scan number and the ground-truth label are displayed on top, while the softmax output (%) for each class is displayed below. Each row corresponds to a different ground-truth biomarker presence.
Figure S3. Data diagram showing the combination of datasets and the selection used for training, validation and testing of the network. Orange marks labels graded by a non-retinal expert, red marks labels taken from electronic health records only, and green marks labels graded by a retinal expert.
Figure S4. Comparison between B-scan predictions only for different %-thresholds of late-stage predicted B-scans, expert-graded central B-scan only prediction, and our 2-stage approach on the whole test set. MCC on the y-axis and different %-thresholds on the x-axis.

Figure S5. Example of conversion grading timeline (top) and OCT volume viewer with biomarker B-scan labels (bottom). Legend for the B-scan labels: Yellow: MNV, red: MA, green: NORMAL, light green: DRUSEN.

Table S2. B-scan characteristics in the development set (training and validation sets) (left) and test set (right).

Table S3. ResNet50 classification performance on the B-scan level test set.

Table S4. Comparison of our approach with two different 3D models in the task of detecting the presence or absence of the four biomarkers MNV, NORMAL, DRUSEN and MA at the volume level in the PINN dataset. Columns: ROC area under the curve (ROC-AUC), balanced accuracy (BACC), accuracy (ACC), Matthews correlation coefficient (MCC), F1 score, sensitivity and specificity per model. Bold marks the highest value.